Lexical and Algorithmic Stemming Compared for 9 European Languages with Hummingbird SearchServer™ at CLEF 2003
Abstract
Hummingbird participated in the monolingual information retrieval tasks of the Cross-Language Evaluation Forum (CLEF) 2003: for natural language queries in 9 European languages (German, French, Italian, Spanish, Dutch, Finnish, Swedish, Russian and English) find all the relevant documents (with high precision) in the CLEF 2003 document sets. For each language, SearchServer scored higher than the median average precision on more topics than it scored lower. In a comparison of experimental SearchServer lexical stemmers with Porter’s algorithmic stemmers, the biggest differences were for the languages in which compound words are frequent (German, Dutch, Finnish and Swedish). SearchServer scored significantly higher in average precision for German and Finnish, apparently from its ability to split compound words and find terms when they are parts of compounds in these languages. Most of the differences for the other languages appeared to be from SearchServer’s lexical stemmers performing inflectional stemming while the algorithmic stemmers often additionally performed derivational stemming; these differences did not pass a significance test.
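The compound-splitting effect described above can be illustrated with a minimal sketch. It assumes a small hand-made lexicon and a fewest-parts dynamic-programming split; the function name `split_compound`, the `min_part` parameter and the sample lexicon are hypothetical, and this is not SearchServer's actual method, which the abstract does not describe. Linking elements (such as the German "s" between parts) are ignored here.

```python
def split_compound(word, lexicon, min_part=3):
    """Split `word` into the fewest lexicon entries that concatenate to it.

    Returns [word] (lower-cased) when no full split is found.
    `lexicon` is a set of lower-case words; `min_part` is the shortest
    part length considered (an illustrative assumption).
    """
    word = word.lower()
    # best[i] is the fewest-parts split of word[:i], or None if none exists.
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(0, end - min_part + 1):
            part = word[start:end]
            if best[start] is not None and part in lexicon:
                candidate = best[start] + [part]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(word)] or [word]


# Hypothetical toy lexicon for illustration only.
lexicon = {"welt", "raum", "fahrt", "weltraum"}
parts = split_compound("Weltraumfahrt", lexicon)
```

With a split like this, a query for one part of a compound can match documents that contain only the compound, which is the kind of match a suffix-stripping stemmer alone would miss.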
Similar resources
European Ad Hoc Retrieval Experiments with Hummingbird SearchServer™ at CLEF 2005
Hummingbird participated in the 4 monolingual information retrieval tasks (Bulgarian, French, Hungarian and Portuguese) of the Ad-Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2005. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant documents (with high precision) in a particular document set. We conducted diag...
European Web Retrieval Experiments with Hummingbird SearchServer™ at CLEF 2005
Hummingbird participated in the mixed monolingual retrieval task of the WebCLEF Track of the Cross-Language Evaluation Forum (CLEF) 2005. In this task, the system was given 547 known-item queries from 11 languages (134 Spanish, 121 English, 59 Dutch, 59 Portuguese, 57 German, 35 Hungarian, 30 Danish, 30 Russian, 16 Greek, 5 Icelandic and 1 French). The goal was to find the desired page in the 8...
Experiments in 8 European Languages with Hummingbird SearchServer™ at CLEF 2002
Hummingbird submitted ranked result sets for all Monolingual Information Retrieval tasks of the Cross-Language Evaluation Forum (CLEF) 2002. Enabling stemming in SearchServer increased average precision by 16 points in Finnish, 9 points in German, 4 points in Spanish, 3 points in Dutch, 2 points in French and Italian, and 1 point in Swedish and English. Accent-indexing increased average precisi...
Experiments in Named Page Finding and Arabic Retrieval with Hummingbird SearchServer™ at TREC 2002
Hummingbird participated in the named page finding task of the TREC 2002 Web Track (find the named page in 18GB from the .GOV domain) and the monolingual Arabic topic relevance task of the TREC 2002 Cross-Language Track (find all relevant documents in 869MB of Arabic news data). In the named page finding task, SearchServer returned the named page in the first 10 rows for more than 80% of the 15...
CJK Experiments with Hummingbird SearchServer™ at NTCIR-5
Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval subtasks of the Cross-Lingual Information Retrieval Task of the 5th NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-5). For short Chinese (title) queries, a decompounded word-based approach produced higher (statistically significant) mean average precision and first relev...
Publication date: 2003